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INTRODUCTION 

Cinnamon, also known as 'true cinnamon', is native to Sri Lanka and the southern parts of India. Cinnamaldehyde, eugenol, and linalool are the three main components of the essential oil obtained from the bark of cinnamon; together these components represent 82.5% of the total composition [1]. Trans-cinnamaldehyde accounts for approximately 49.9-62.8% of the total amount of bark oil [2,3]. Cinnamaldehyde and eugenol are also the two major components of cinnamon extracts [4]. There are two main varieties of cinnamon, Cinnamomum zeylanicum (CZ) and Cinnamomum cassia (CC). These varieties differ basically in their coumarin (1,2-benzopyrone) content [5]. The levels of coumarin in CC appear to be very high and can pose a health risk if consumed regularly in larger quantities. According to the German Federal Institute for Risk Assessment (BfR), 1 kg of CC powder contains approximately 2.1-4.4 g of coumarin, which means that 1 teaspoon of CC powder would contain around 5.8-12.1 mg of coumarin. This can exceed the TDI (Tolerable Daily Intake) for coumarin of 0.1 mg/kg body weight/day, which was recommended by the European Food Safety
Authority (EFSA) [6]. The BfR report states that CZ contains 'hardly any' coumarin. Coumarins are secondary phytochemicals with strong anticoagulant, carcinogenic and hepatotoxic properties [6]. The fundamental mechanisms of the coumarin-related toxic effects are yet to be completely clarified. CC contains a higher concentration of coumarin than any other food. Studies have shown that coumarin exposure from food consumption is mainly due to CC. Currently available evidence shows that coumarin does not appear to play any direct role in the observed biological effects of CC. However, the CC variety has been shown to have many beneficial pharmaceutical properties [6,7]. Numerous beneficial health effects of CZ have been confirmed through in vitro and in vivo studies in animals. These include anti-inflammatory properties, reduction of cardiovascular disease, anti-microbial activity, boosting of cognitive function and a reduced risk of colonic cancer. Cinnamon was mentioned in Chinese texts as long as 4,000 years ago and is one of the oldest herbal medicines known [8].
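
The coumarin figures quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the teaspoon mass (2.75 g) and the 60 kg body weight are assumptions that do not appear in the text, chosen because a teaspoon of about that mass is consistent with the quoted 5.8-12.1 mg range.

```python
# Figures quoted in the text.
COUMARIN_G_PER_KG = (2.1, 4.4)   # g of coumarin per kg of CC powder
TDI_MG_PER_KG_BW = 0.1           # mg per kg body weight per day (EFSA)

# Assumptions for illustration only (not stated in the text).
TEASPOON_G = 2.75                # assumed mass of one teaspoon of powder, in g
BODY_WEIGHT_KG = 60.0            # hypothetical adult body weight


def coumarin_mg(powder_g, g_per_kg):
    """Coumarin in mg for a given mass of powder.

    g/kg is numerically equal to mg/g, so mg = grams of powder * (g/kg).
    """
    return powder_g * g_per_kg


low, high = (coumarin_mg(TEASPOON_G, c) for c in COUMARIN_G_PER_KG)
tdi_mg = TDI_MG_PER_KG_BW * BODY_WEIGHT_KG

print(f"coumarin per teaspoon: {low:.1f}-{high:.1f} mg")
print(f"TDI for a {BODY_WEIGHT_KG:.0f} kg person: {tdi_mg:.1f} mg/day")
```

Under these assumptions a single teaspoon sits close to the daily limit at the low end of the range and roughly doubles it at the high end.
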
EST-SSR (expressed sequence tag-simple sequence repeat) is a newly developed type of molecular marker based on microsatellites in expressed sequences. This technology has the advantage of avoiding the construction of a genomic DNA library in the SSR development process; it gives markers that are directly involved in gene function and shows similarity across functional genomic regions. EST-SSRs can explain phenotypic differences on the basis of their polymorphisms. Because they are part of genes, EST-SSRs are highly conserved within a species, which makes their primers more widely usable among species. EST sequences therefore act as valuable resources for the development of SSR markers. In recent years, several studies have revealed that vast numbers of ESTs have accumulated as a result of in-depth research on different species. These accumulated EST data provide a platform for the development of SSR markers [9-11].
Various sequencing and EST projects generate large amounts of DNA sequence data that are easily accessible to the public. These data include both genic (EST) and genomic sequences, which can be further used in the development of markers such as SSRs, SNPs, etc. The presence of any marker type in such easily accessible data allows markers to be generated at low cost; for example, SSRs present in genic sequences are called EST-SSRs [9].

Once mapped, EST-SSR markers are associated with the genes that carry them. They also act as a valuable source of functional markers. Thus the development of EST-based SSR markers is a cheap alternative to the conventional SSR development method. In genome analysis of sorghum, these EST-SSRs play a major role in producing lasting insight into the processes by which novel genotypes are generated; such advantages help in the application of crop breeding programs [9-11].

The conventional method of developing SSR markers is tedious and costly. However, with the availability of genic EST sequences and genomic sequences in open public databases, and the availability of bioinformatics tools, the development of SSR markers is becoming cheaper and easier [12]. SSR markers have already been generated from EST databases in several crops. EST-SSR markers have been widely used for diversity analysis in several crops, such as wheat [10,13,14] and barley [9], in the mapping of barley [11,15], and in pearl millet [16] and finger millet [17]. Genomic SSRs are more polymorphic than the EST-derived SSR markers, which come from the transcribed regions of the genome [12,15].

In terms of cross-species transferability, EST-SSR markers are superior because they are derived from the most conserved regions of the genome, which makes them very useful in comparative genome mapping and phylogenetic analysis. A small number (30) of EST-SSR markers were developed in sorghum with wheat, rice and maize [18]. These markers have also shown a high transfer rate in several crop systems. EST-SSR markers developed in wheat showed 62% transferability across all four species, barley, maize, wheat and rice. EST-SSRs showed a 40% transferability rate from barley to rice [11,15].


MATERIALS AND METHODS 


Development of EST-SSR markers: Molecular markers are prominently used in the improvement of species; they help to identify polymorphisms and mating system parameters, and they support marker-assisted selection and genotype characterization. EST-SSRs were constructed for cinnamon because we found that no EST-SSR had been developed for it to date.
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Recognition of EST sequences 

First, the EST sequences of cinnamon were retrieved in FASTA format from NCBI (https://www.ncbi.nlm.nih.gov/), the National Center for Biotechnology Information, which advances science and health by providing access to biomedical and genomic information. The MISA web tool (http://webblast.ipk-gatersleben.de/misa/) was then used to recognize and determine perfect microsatellites, as well as compound microsatellites interrupted by a certain number of bases, in the ESTs retrieved from NCBI, followed by the design of primers at the microsatellite loci.


MISA 
In plant genetics and forensic science, microsatellites are a prominently used marker system. The challenge is to derive microsatellite markers from re-sequencing data. MISA is a web-based computational tool that helps in the development of microsatellite markers. MISA web can be accessed at http://misaweb.ipk-gatersleben.de/. Microsatellites emerged about 25 years ago and are still among the most common genetic markers used in plant breeding, plant genetics and forensic science, where they are generally known as simple sequence repeats (SSRs) or short tandem repeats (STRs). The basic structural block of a microsatellite is a short sequence motif, between one and six base pairs in length, which is repeated in tandem. These characteristics can easily be detected by an in-silico approach using nucleotide sequences from high-throughput sequencing data or the Sanger method [10,12].
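
The tandem-repeat definition above lends itself to a short illustration. The following is a minimal sketch of perfect-SSR detection with regular expressions; it is not MISA's implementation, and the minimum repeat counts in `MIN_REPEATS` are assumed values chosen for illustration, since the thresholds used in this study are not stated. The function name `find_ssrs` is hypothetical.

```python
import re

# Minimum number of repeats per motif length (assumed values, not from the paper).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit in sorted(min_repeats):
        # A motif of `unit` bases followed by at least (min - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats[unit] - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)


# Hypothetical test sequence: one mono-, one di- and one trinucleotide repeat.
example = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "AAT" * 5 + "G"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

Compound microsatellites (two SSRs separated by a short interruption) are not handled here; they could be derived by checking the distance between consecutive hits.
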


Pre-processing of the FASTA sequences 

The retrieved FASTA sequences were first pre-processed with the CAP3 software (http://doua.prabi.fr/software/cap3), which was freely available on a web server, to identify the non-redundant EST sequences. CAP3 runs an algorithm that computes the overlaps between the sequences and then joins the reads in decreasing order of overlap score to form contigs. After pre-processing of the FASTA sequences, CAP3 gave two files, i.e. contigs and singletons, which were further processed separately [12].
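
To make the contig/singleton distinction concrete, here is a toy greedy overlap merger. It is a deliberately simplified sketch and not CAP3's algorithm: it only accepts exact suffix-prefix overlaps, ignores base quality and reverse complements, and the `min_len` threshold and the function names are assumptions made for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (>= min_len), else 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the pair of sequences with the largest exact overlap."""
    # Each item is (sequence, number of reads merged into it); exact duplicates dropped.
    items = [(r, 1) for r in dict.fromkeys(reads)]
    while True:
        best = (0, None, None)
        for i, (a, _) in enumerate(items):
            for j, (b, _) in enumerate(items):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break
        merged = (items[i][0] + items[j][0][k:], items[i][1] + items[j][1])
        items = [it for n, it in enumerate(items) if n not in (i, j)] + [merged]
    contigs = [s for s, n in items if n > 1]      # built from two or more reads
    singletons = [s for s, n in items if n == 1]  # reads that never merged
    return contigs, singletons


# Hypothetical reads: the first three overlap end to end, the fourth stands alone.
reads = ["ATGGCGTACG", "GTACGTTAGC", "TTAGCCATGA", "CCCCGGGGAA"]
contigs, singletons = greedy_assemble(reads)
print("contigs:", contigs)
print("singletons:", singletons)
```

The two returned lists play the role of the two CAP3 output files mentioned above.
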


Selection of candidate EST sequences 

The non-redundant SSR-containing EST sequences of cinnamon were used for a homology search with the Basic Local Alignment Search Tool (BLAST) available at NCBI. From all the BLAST hits, the appropriate EST giving the maximum score was selected, followed by recognition of the homologous genomic region. BLAST was also performed for the analysis of complete sequence coverage across the genome.


Primer Designing 

The selected contigs (SSR-containing EST sequences) and the singletons were used to design primer pairs with Primer3 (http://biotools.umassmed.edu/bioapps/primer3_www.cgi). The primers were designed to satisfy the following conditions: product size (min 70 nt, opt 160 nt, max 250 nt), Tm (min 54°C, opt 57°C, max 60°C) and GC content (min 45%, opt 50%, max 60%) [12].
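
The GC and Tm windows above can be illustrated with a simple screening function. This is a rough sketch only: Primer3 estimates Tm with a thermodynamic model, whereas the code below substitutes the much cruder Wallace rule, so its Tm values are approximations and will not generally match Primer3 output. The function names are hypothetical; the example primer is the forward primer listed for DY327125.1 in Table 1.

```python
def gc_percent(primer):
    """GC content of a primer, in percent."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Rough Tm estimate in deg C (Wallace rule): 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def passes_filters(primer, gc_range=(45.0, 60.0), tm_range=(54.0, 60.0)):
    """True if the primer falls inside the GC and Tm windows given in the text."""
    gc, tm = gc_percent(primer), wallace_tm(primer)
    return gc_range[0] <= gc <= gc_range[1] and tm_range[0] <= tm <= tm_range[1]


forward = "GCACCATCTTCGTCCTTCAT"  # forward primer for DY327125.1 (Table 1)
print(gc_percent(forward), wallace_tm(forward), passes_filters(forward))
```
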


BLAST 

BLAST (Basic Local Alignment Search Tool), developed by Altschul and colleagues, is the most common local alignment tool. It is based on a set of algorithms in which a fragment of the query sequence is aligned with a fragment of a subject sequence present in the database. The initial alignment should score above the neighbourhood score threshold (T). The alignment can then be extended in both directions for as long as the score of the aligned segment increases.

There are two kinds of alignment, global and local. Global approaches compare a whole sequence with other full-length sequences. In the local method, a part of one sequence is aligned with a part of another sequence. Global alignment gives an end-to-end comparison of one sequence with another, whereas local alignment highlights the regions of highest similarity but does not compare the two sequences over their full length. The global approach is useful for comparing a small group of sequences, because the cost increases as the number of sequences compared increases. Local alignment tools such as BLAST are based on a heuristic approach that is well suited to very large databases, but they are not guaranteed to give the optimum solution. Despite this limitation, local alignment plays a major role in genomics because it uncovers regions of similarity shared by two otherwise diverse sequences.
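
The difference between global and local alignment can be seen on a small example. The sketch below computes exact dynamic-programming scores (Needleman-Wunsch for global, Smith-Waterman for local); it is not BLAST, which replaces the exhaustive search with a seed-and-extend heuristic. The scoring values (match 2, mismatch -1, gap -2) and the two sequences are assumptions chosen for illustration.

```python
def global_score(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch score: both sequences are aligned end to end."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman score: best-scoring pair of sub-sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero, so an alignment can restart anywhere.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


query, subject = "ACGT", "TTACGTTT"
print("global:", global_score(query, subject))
print("local: ", local_score(query, subject))
```

The short query is fully contained in the subject, so the local score rewards the four matching bases, while the global score is pulled down by the gaps needed to span the whole subject.
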


SWISS MODEL - ExPASy 
ExPASy is a bioinformatics resource portal that gives access to scientific databases and software tools in different areas of the life sciences. It hosts useful tools such as SWISS-MODEL, UniProt, PROSITE and STRING. SWISS-MODEL is a fully developed protein structure homology modelling server accessed through the ExPASy web server. This server is intended to make protein modelling accessible to all life science researchers worldwide; from a FASTA sequence it provides the 3D structure of a protein.

RESULTS AND DISCUSSION

MISA (Microsatellite Identification Search Tool)

The EST sequences retrieved from the NCBI database were processed with the CAP3 and MISA software for the cinnamon plant. MISA gave the results discussed below.

Distribution frequency of repeat units for all the SSRs in Cinnamon
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RESULTS OF MICROSATELLITE SEARCH 


Total number of sequences examined:
Total size of examined sequences (bp):
Total number of identified SSRs: 1139
Number of SSR containing sequences:
Number of sequences containing more than 1 SSR:
Number of SSRs present in compound formation:

Distribution to different repeat type classes

Unit size    Number of SSRs
1            692
2            161
3            270
4            11
5            1
6            4

Based on the results of the cinnamon MISA analysis, a total of 1139 SSRs were identified: 692 mononucleotide repeats, 161 dinucleotide repeats, 270 trinucleotide repeats, 11 tetranucleotide repeats, 1 pentanucleotide repeat and 4 hexanucleotide repeats.
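
The share of each repeat class follows directly from these counts. The short sketch below recomputes the percentages from the numbers reported above; only the variable names are introduced here.

```python
# SSR counts per repeat unit size, as reported in the MISA results above.
ssr_counts = {1: 692, 2: 161, 3: 270, 4: 11, 5: 1, 6: 4}

total = sum(ssr_counts.values())
shares = {unit: 100.0 * n / total for unit, n in ssr_counts.items()}

for unit, n in ssr_counts.items():
    print(f"unit size {unit}: {n:4d} SSRs ({shares[unit]:.2f}%)")
print(f"total: {total}")
```

The six classes sum to the reported total of 1139, and mononucleotide repeats account for roughly three fifths of all SSRs.
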


Figure 1: Distribution frequency of repeat units for all the SSRs in Cinnamon (x-axis: unit size; series: number of SSRs and percentage; surviving data labels: 11 SSRs, 0.96%; 1 SSR, 0.08%; 4 SSRs, 0.35%).
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Figure 2: Distribution frequency of nucleotides for all the 


SSRs in Cinnamon. 


Figure 2 indicates that the A/T mononucleotide was the most common repeat among all SSR motifs, while the most common dinucleotide motif was AG/CT. Among the trinucleotide repeats AAT/ATT was the most common, and among the tetranucleotide SSR motifs AAAT/ATTT was the most common. The pentanucleotide motif was AAAGC/CTTTG, and among the hexanucleotide SSR motifs AAGCAG/CTGCTT was common. These results were similar to previous studies [12].


CAP3 


The EST sequences downloaded from the public domain were pre-processed with the CAP3 software. The CAP3 program helps to eliminate redundant data from the sequence file and ultimately produces two files, one containing the contig sequences and the other the singleton sequences. The results are summarized below:

Number of contigs: 2233
Number of singletons: 2305


International Journal of Advanced Research in Biotechnology and Nanobiotechnology 


Volume I, Issue I, July 2020 


Figure 3: Distribution of singletons and contigs in Cinnamon (data labels: 49.2%, 50.7%).


BLAST 

BLAST was carried out using nucleotide BLAST analysis; all 2233 contig and 2305 singleton sequences of cinnamon were subjected to BLAST to analyze the putative function of the sequences. On the basis of their best match, all SSR loci were divided into three groups: biological function, protein function and enzymatic function.

Figure 4: Biological distribution of contigs and singletons.


Biological Function: Genes with biological functions cover all vital processes such as metabolism, photosynthesis, cell signalling and responses to environmental factors. In this analysis, a total of 1207 sequences were obtained from the contigs and singletons produced by CAP3. Of these, 9 were assigned to ribosomal RNA genes, 59 to ribosomal proteins, 11 to protein mRNA, 333 to chloroplastic and 186 to mitochondrial proteins, 5 to histones, 27 to enzymes, 30 to complete genomes, 109 to chromosomes, 6 to microsatellites, 22 to cytoplasm, 6 to cell wall structural proteins, 262 to uncharacterized proteins, 36 to hypothetical proteins, 23 to whole genomes, 115 to cytochrome and 6 to transmembrane proteins.

Protein Function: With the help of nucleotide BLAST analysis, a total of 170 proteins were found in the sequences. Among these, 16 were heat shock proteins, 49 transcription proteins, 15 auxin, 2 steroid, 13 splicing, 68 BURP domain proteins, 4 syntaxin, 4 calcium-binding proteins and 1 transposon.

Enzymatic Function: Cinnamon is an important medicinal plant used for treating various diseases. It contains more than 2338 [...] oxidases & peroxidases, 21 isozymes, 38 isomerases, 40 oxygenase, 217 dehydrogenase, 112 kinase, 565 transferases, 122 ligase, 20 hydrolase, 4 enolase, 20 phosphatase, 21 transaldolase, 2 phosphorylase, 89 hydroxylase, 19 proteasome, 52 synthetase, 6 decarboxylase, 8 demethylase, 2 ribosome, 20 ferredoxin and 53 adenosylhomocysteinase.

Figure 6: Enzyme distribution of contigs and singletons.
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Primer3 


Primers were designed on the basis of suitable nucleotide sequences after BLAST of the contigs and singletons.


Table 1: Characteristics of EST-derived SSRs for Cinnamon.

| ID | Tm (°C) | GC% | Forward Primer | Reverse Primer | Product size | Predicted function based on BLAST | Accession |
|---|---|---|---|---|---|---|---|
| DY327125.1 | 60 | 50 | GCACCATCTTCGTCCTTCAT | TAACATTCCCCAGCTTCGTC | 170 | Erythranthe guttatus G-type lectin S-receptor-like serine/threonine-protein kinase At1g34300 (LOC105974395), mRNA | XM_012999490.1 |
| DY327131.1 | 60 | 50 | ATCCTCTGGAAGAGCTGCAA | TGATCAAGTGCGACCTTCAG | 234 | Erythranthe guttatus pyruvate kinase, cytosolic isozyme (LOC105956735), mRNA | XM_012980626.1 |
| DY327152.1 | 60 | 53 | ACTCATCTCGACAGCCTCGT | CGGCACATCTTTCAGGAGAT | 207 | Agastache rugosa chalcone synthase (CHS) mRNA, complete cds | JQ314450.1 |
| DY327176.1 | 60 | 45 | CCTTGGTTTTAACGCTGGAA | GCCATGGGATAGAGCAAAAA | 249 | Ocimum basilicum germacrene D synthase (GDS) mRNA, complete cds | AY693644.1 |
| DY327188.1 | 59 | 50 | CGCACTCTTCATCACTCCAA | ACTGCTATAAGCGCCATCGT | 249 | Sesamum indicum (RS)-norcoclaurine 6-O-methyltransferase-like (LOC105173010), mRNA | XM_011094646.2 |
| DY327192.1 | 59 | 53 | ACTGTTGGACCATCCAGAGG | CCCAAAGCAAGAATCTCAGC | 155 | Sesamum indicum cyclin-dependent kinase D-3 (LOC105166737), transcript variant X2, mRNA | XM_020695628.1 |
| DY327215.1 | 60 | 48 | CCACTTCATGCTCCCTGTTT | GAAGCAAAATTCGGTTGGAA | 234 | Sesamum indicum 3-phosphoshikimate 1-carboxyvinyltransferase 2 (LOC105171218), mRNA | XM_011092260.2 |
| DY327278.1 | 60 | 50 | ATGAGAAACATGGCGAGGAC | TTCTTCTTCTCAGCGCCTTC | 208 | Sesamum indicum protein SRC1 (LOC105173924), mRNA | XM_011095843.2 |
| DY327305.1 | 60 | 53 | GAAGGACTTCCCCGATTCTC | TGCTTAACAGCAACGACCTG | 162 | Sesamum indicum serine/threonine-protein kinase PBS1 (LOC105168454), mRNA | XM_011088543.2 |
| DY327320.1 | 60 | 50 | AGAGAGAGATTCGCCGATCA | TTCGTCACTCGTGCTGAAAG | 219 | Olea europaea var. sylvestris serine/arginine-rich splicing factor SR45a-like (LOC111369703), transcript variant X3, misc_RNA | XR_002698229.1 |
| DY327324.1 | 60 | 50 | ATCCCATCCATCCTTCCTTC | CGATCGACACATCGAAGCTA | 155 | Sesamum indicum glycosyltransferase family protein 64 protein C5 (LOC105174171), mRNA | XM_011096181.2 |
| DY327360.1 | 60 | 50 | AAACACAAGGTGCACCACAA | GCGATGGAGAGCCAACTTAG | 180 | Sesamum indicum autophagy-related protein 18f (LOC105176725), mRNA | XM_011099623.2 |
| DY327460.1 | 60 | 50 | CCTTGGTTTTAACGCTGGAA | GCCATGGGATAGAGCAAAAA | 249 | Ocimum basilicum germacrene D synthase (GDS) mRNA, complete cds | AY693644.1 |
| DY327475.1 | 59 | 50 | CAAGCTGTTCAACCCCAAAT | AGCGAGCTTCCTCATCTCAG | 178 | Sesamum indicum acyl-coenzyme A oxidase 3, peroxisomal (LOC105178460), mRNA | XM_011101927.2 |
| DY327481.1 | 60 | 50 | GCAAGGTAGTGCCCAATCAT | GAAGTTGCGCAAGGCTAAAC | 177 | Sesamum indicum 40S ribosomal protein S15 (LOC105171649), mRNA | XM_011092832.2 |
| DY327482.1 | 59 | 55 | ATCATTTGTGGAGGGAGTGC | CCCTTGACCCCCTTAGACTC | 199 | Erythranthe guttatus serine hydroxymethyltransferase 4 (LOC105964521), transcript variant X2, mRNA | XM_012989028.1 |
| DY327495.1 | 60 | 50 | AGTGATCTCTTTGGGCATGG | TGAGAGCAAGGGAGGAGAAA | 166 | Ocimum basilicum gamma-cadinene synthase (CDS) mRNA, complete cds | AY693645.1 |
| DY327503.1 | 60 | 50 | GAGGTCGAAGATCCCACAGA | TCAAATTGGTGCTCTTGCTG | 176 | Sesamum indicum serine hydroxymethyltransferase 4 (LOC105166533), mRNA | XM_011085916.2 |
| DY327504.1 | 59 | 55 | ATCATTTGTGGAGGGAGTGC | CCCTTGGACCCCTTAGACTC | 199 | Erythranthe guttatus serine hydroxymethyltransferase 4 (LOC105964521), transcript variant X2, mRNA | XM_012989028.1 |

International Journal of Advanced Research in Biotechnology and Nanobiotechnology
Volume I, Issue II, July 2020
ISSN: 2582-3310
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The 3D structures of important protein which were represented 


by contig and singleton sequences were developed with the help of SWISS-MODEL (ExPASy).


Table 2: Protein structures predicted on the basis of BLAST results.

PREDICTED: Sesamum indicum (RS)-norcoclaurine 6-O-methyltransferase-like (LOC105173010), mRNA
PREDICTED: Sesamum indicum chorismate synthase 1, chloroplastic (LOC105166625), mRNA
PREDICTED: Erythranthe guttatus protein BPS1, chloroplastic-like (LOC105960562), transcript variant X2, mRNA
PREDICTED: Sesamum indicum autophagy-related protein 18f (LOC105176725), mRNA
PREDICTED: Sesamum indicum aquaporin TIP1-1-like (LOC105169946), mRNA
PREDICTED: Sesamum indicum syntaxin-112-like (LOC105178362), mRNA
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CONCLUSION 

The EST-SSRs developed in the present study will be highly useful in genotyping cinnamon accessions with microsatellite markers, which can reveal the genetic diversity among accessions. This information will help to select better parents with desired genes so that the progeny can be developed into new commercial varieties. It helps to generate novel material of the species with higher productivity and quality traits, contributing to sustainable development. The development of cinnamon SSRs further helps in the characterization of potential genetic markers, which are very important for crop improvement and gene mapping. These EST-SSR markers play a major role in determining the genetic relationships, pedigree and genetic background of the species.
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